Network Troubleshooting from End-Hosts
Abstract
Troubleshooting faults or performance disruptions in today’s Internet is frustrating at best. When users experience performance or connectivity problems, there is little they can do: the most common responses are to reboot or to call the provider’s hotline. Network providers have more information with which to diagnose problems in their networks, but their troubleshooting is often largely manual and ad hoc. We argue that automatically troubleshooting network faults or performance disruptions requires monitoring capabilities deployed at end-hosts (in or close to the customer’s premises). End-host monitoring is necessary to detect the problems that affect end-users. In addition, when problems happen outside the control of the network administrator or the end-user performing the troubleshooting, end-host monitoring is the only way to identify the location of the problem. The goal of our research is to make network troubleshooting more transparent by designing tools that require as little human involvement as possible, from both end-users and administrators. This document presents our initial steps towards automatic network troubleshooting from end-hosts as well as our long-term objectives. Our main contributions are the design of more accurate and efficient end-host measurement methods. We work on techniques both to detect faults and performance disruptions and to identify the location of the problem. For fault identification, we improve the two basic techniques that use end-to-end measurements: traceroute and network tomography. First, we show that traceroute, the most widely used diagnosis tool, reports erroneous paths in the presence of routers that perform load balancing. We build Paris traceroute to correct these errors. Second, we design measurement methods for accurate fault identification using network tomography.
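The traceroute error described above can be illustrated with a toy simulation. The sketch below is not the Paris traceroute implementation; it assumes a hypothetical four-hop topology with one per-flow load balancer (router names L, B, C, D, E, F are invented), and it stands in for a router's flow hash with a simple modulo on a numeric flow identifier. It contrasts a tracer that changes the flow identifier on every probe (as classic UDP traceroute does by incrementing the destination port) with one that keeps it constant.

```python
# Hypothetical topology: load balancer L splits flows over two disjoint branches.
TRUE_PATHS = [
    ["L", "B", "D", "F"],
    ["L", "C", "E", "F"],
]
TRUE_LINKS = {(p[i], p[i + 1]) for p in TRUE_PATHS for i in range(len(p) - 1)}


def route(flow_id):
    """Toy per-flow load balancing: the flow identifier selects the branch.

    Assumption: real routers hash several header fields; modulo is a stand-in.
    """
    return TRUE_PATHS[flow_id % len(TRUE_PATHS)]


def probe(ttl, flow_id):
    """Return the router that answers a probe expiring after `ttl` hops."""
    return route(flow_id)[ttl - 1]


def classic_traceroute(first_port=33434):
    """Each probe uses a new destination port, so the flow identifier changes."""
    return [probe(ttl, first_port + ttl - 1) for ttl in range(1, 5)]


def paris_traceroute(flow_id=33434):
    """All probes share one flow identifier, so they follow one branch."""
    return [probe(ttl, flow_id) for ttl in range(1, 5)]


def false_links(path):
    """Links inferred from consecutive hops that do not exist in the topology."""
    return set(zip(path, path[1:])) - TRUE_LINKS


print(classic_traceroute(), false_links(classic_traceroute()))
# ['L', 'C', 'D', 'F'] {('C', 'D')}
print(paris_traceroute(), false_links(paris_traceroute()))
# ['L', 'B', 'D', 'F'] set()
```

In this toy setting the varying-flow tracer stitches hops from both branches into one path and infers a link C-D that does not exist, while the constant-flow tracer reports one of the real paths.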
Network tomography assumes up-to-date and correlated measurements of the status of end-to-end paths and of the network topology, which are hard to obtain in practice. We design techniques to track correlated path reachability and network topology. Finally, we design techniques for lightweight detection of faults and performance disruptions. We minimize the overhead of active end-to-end probing for fault detection and design a tool to collect passive measurements at end-hosts.
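To make the tomography inputs concrete, here is a minimal sketch of one generic form of binary network tomography. It is a simple greedy heuristic, not necessarily the method developed in this work. It takes exactly the two inputs the abstract names (the links on each end-to-end path and the up/down status of each path), clears every link that appears on a working path, and then repeatedly picks the remaining link that explains the most failed paths. The path names, link names, and the function `locate_faults` are hypothetical.

```python
def locate_faults(paths, is_up):
    """Greedy fault localization from end-to-end path status.

    paths: dict mapping a path name to the list of links it traverses.
    is_up: dict mapping a path name to True (reachable) or False (failed).
    Returns a small list of links that together explain the failed paths.
    """
    # A link on any working path is assumed to be working.
    good_links = set()
    for name, links in paths.items():
        if is_up[name]:
            good_links.update(links)

    # Candidate failed links for each failed path.
    candidates = {
        name: set(links) - good_links
        for name, links in paths.items()
        if not is_up[name]
    }

    hypothesis = []
    unexplained = set(candidates)
    while unexplained:
        # Count how many unexplained failed paths each candidate link covers.
        counts = {}
        for name in unexplained:
            for link in candidates[name]:
                counts[link] = counts.get(link, 0) + 1
        if not counts:
            break  # inconsistent measurements: nothing left to blame
        # Sorting first makes ties resolve deterministically.
        best = max(sorted(counts), key=counts.get)
        hypothesis.append(best)
        unexplained = {n for n in unexplained if best not in candidates[n]}
    return hypothesis


# Hypothetical measurements from one source s to three destinations.
paths = {
    "s->d1": ["s-r1", "r1-r2", "r2-d1"],
    "s->d2": ["s-r1", "r1-r2", "r2-d2"],
    "s->d3": ["s-r1", "r1-d3"],
}
status = {"s->d1": False, "s->d2": False, "s->d3": True}

print(locate_faults(paths, status))  # ['r1-r2']
```

The sketch also shows why stale or uncorrelated inputs hurt: if the status of "s->d3" were measured before the failure, or the recorded paths no longer matched the real topology, the wrong links would be cleared or blamed.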
Related resources
End-to-End Network/Application Performance Troubleshooting Methodology
The computing models for HEP experiments are globally distributed and grid-based. Obstacles to good network performance arise from many causes and can be a major impediment to the success of the computing models for HEP experiments. Factors that affect overall network/application performance exist on the hosts themselves (application software, operating system, hardware), in the local area netw...
Towards Improved Control and Troubleshooting for Operational Networks
Over the past decade, operational networks have grown tremendously in size, performance and importance. This concerns particularly the Internet, the ultimate “network of networks.” We expect this trend to continue as more and more services traditionally provided by the local computer move to the cloud, e.g., file storage services and office applications. In spite of this, our ability to contro...
An Instrumentation and Measurement Framework for End-to-End Performance Analysis
This paper presents Periscope: a metric caching, analysis, and visualization platform for troubleshooting end-to-end network and system performance. Periscope collects and caches on-demand, real-time measurements from both global network monitoring infrastructures such as perfSONAR as well as locally generated host metrics in a common representation. We describe a flexible host monitoring imple...
Peeking without Spying: Collecting End-Host Measurements to Improve User Experience
Obtaining data directly from end hosts could benefit a broad set of applications, including application performance diagnosis, network troubleshooting, security profiling and even energy-saving policies. However, the data central to the study and development of such tools is fundamentally hard to collect. This situation has arisen both for technical reasons and for psychological ones. We believ...
Designing an Expert System for Internet Connection Problems Troubleshooting for wired network users
We live in an era in which knowledge is estimated to double in a relatively short time. The rapid growth of technology in the "century of information" is driven by the fast growth of communication technologies such as the Internet, which has become one of the best tools for quick, cheap, effective and widely supported communication. For an efficient and effective usage of tools and ...
Publication date: 2010